Microblog Sentiment Analysis Based on Paragraph Vectors

Authors

  • Chengcheng Hu
  • Xuliang Song
Abstract

Microblog sentiment analysis aims to discover users’ attitudes toward hot events. Its difficulties lie in the short length of the texts and the lack of labeled corpora. Para2vec, which is based on deep learning, has recently attracted attention, and the low-dimensional paragraph vectors it trains achieve excellent results on sentiment analysis. However, when applying it to sentiment analysis on microblogs, we find that it does not work as well as on ordinary texts. In this paper, we analyse the weaknesses of microblog sentiment analysis based on paragraph vectors. We then propose two categories of methods, model extension and emotional tendency vectors, to improve the para2vec model. The experimental results confirm the rationality of our methods. Data analysis shows that our improved methods can effectively reduce the adverse effects of short text and greatly improve the accuracy of sentiment analysis.

Similar resources

A Joint Model for Chinese Microblog Sentiment Analysis

Topic-based sentiment analysis for Chinese microblogs aims to identify the user attitude on specified topics. In this paper, we propose a joint model that incorporates Support Vector Machines (SVM) and a deep neural network to improve the performance of sentiment analysis. Firstly, an SVM classifier is constructed using N-gram, NPOS and sentiment lexicon features. Meanwhile, a convolutional neural ...

Bayesian Paragraph Vectors

Word2vec (Mikolov et al., 2013b) has proven to be successful in natural language processing by capturing the semantic relationships between different words. Built on top of single-word embeddings, paragraph vectors (Le and Mikolov, 2014) find fixed-length representations for pieces of text with arbitrary lengths, such as documents, paragraphs, and sentences. In this work, we propose a novel int...

Document Embedding with Paragraph Vectors

Paragraph Vectors has recently been proposed as an unsupervised method for learning distributed representations for pieces of text. In their work, the authors showed that the method can learn an embedding of movie review texts which can be leveraged for sentiment analysis. That proof of concept, while encouraging, was rather narrow. Here we consider tasks other than sentiment analysis, provide...

NDMSCS: A Topic-Based Chinese Microblog Polarity Classification System

In this paper, we focus on the topic-based microblog sentiment classification task, which classifies a microblog’s sentiment polarity toward a specific topic. Most existing approaches to sentiment analysis adopt a target-independent strategy, which may assign irrelevant sentiments to the given topic. In this paper, we leverage the non-negative matrix factorization to get the relev...

Personalized Microblog Sentiment Classification via Multi-Task Learning

Microblog sentiment classification is an interesting and important research topic with wide applications. Traditional microblog sentiment classification methods usually use a single model to classify the messages from different users and omit individuality. However, microblogging users frequently embed their personal character, opinion bias and language habits into their messages, and the same ...

Journal:
  • JCP

Volume 11

Publication date: 2016